ABSTRACT

This report presents the results of a two-part project. The first part presents results
of performance assessment tests on an Internet Library Information Assembly Data
Base (ILIAD). It was found that ILIAD performed best when queries were short (one to
three keywords) and were made up of rare, unambiguous words. In such cases, as many as
64% of the typically 25 returned documents were found to be relevant. It was also found
that a query format that was less rigid with respect to spelling errors and punctuation
marks would be more user-friendly.

The second part of the report shows the design of a Kalman Filter for estimating
motion parameters of a three-dimensional object from sequences of noisy data derived
from two-dimensional pictures. Given six measured deviation values representing X, Y, Z,
pitch, yaw, and roll, twelve parameters were estimated, comprising the six deviations and
their time rates of change. Values for the state transition matrix, the observation matrix, the
system noise covariance matrix, and the observation noise covariance matrix were
determined. A simple way of initializing the error covariance matrix was pointed out.

INTRODUCTION


A two-part project was undertaken. For the first part, some tests were performed
on ILIAD (Internet Library Information Assembly Data Base) [1] in order to assess its
performance and user-friendliness, and to develop some insight into the search technique
utilized by ILIAD. For the second part, a Kalman Filter was designed for estimating a
rigid body’s motion parameters from noisy images. The ILIAD tests will be described first,
followed by the Kalman Filter design.

THE ILIAD TESTS 

Introduction. 

ILIAD, an Internet-based information search and retrieval system with a
built-in intelligent agent, was designed and implemented at the NASA Johnson Space
Center by the Client Server Branch of the Information Systems Directorate. It was
designed as an intelligent data base primarily to serve K-12 teachers. In operation, a
teacher sends a “query”, made up of key words, to ILIAD via electronic mail (e-mail).
ILIAD uses the key words to search the Internet for documents whose contents deal with
the subject matter represented by the key words. The entire contents of these documents
are sent, by e-mail, to the person who sent the query.

It must be emphasized that, apart from the tests described here, the designers of
ILIAD performed their own series of tests. What is reported here is, for the most part, the
result of the author’s tests. Although these tests are not comprehensive, their value lies in
the fact that they provide a sense of how well ILIAD is working (more than thirty
queries were generated and sent by the author) and help point to some of the issues that
must be addressed in any future systematic testing of, and improvements upon, ILIAD.
The queries employed dealt with subject matter that the author was interested in.

Testing Procedure: 

The testing process comprised the following: 

1. Develop a series of queries spanning a variety of subject areas. 

2. Send queries to ILIAD. 

3. Observe simplicity/complexity of query submission and response. 

4. Examine ILIAD responses, noting: 

a) Response Time. 

b) Relevance of Returned Documents. 

5. Select a few representative queries and the corresponding returned documents. 

6. Examine these documents in detail with respect to: 

a) Ratio of relevant versus non-relevant documents.

b) Choice of words or terminology and the relevance of returned documents.

c) ILIAD’s relevance criteria versus the query submitter’s relevance criteria:

- “Key words” search or “Key phrases” search?

RESULTS AND DISCUSSION


Query Format, Simplicity And User Friendliness. 

The author found the query format and ILIAD acknowledgements to be quite
simple and user-friendly. The user only needs to type in the following:

*TeacherEmail: the user’s actual e-mail address is typed here.

*VolumeLevel: type in low, medium, or high to indicate the amount of material desired.

*?Q1: the actual query key words are typed in here.

Problems arise only when the user is careless in typing the required items in the
exact format specified. Such errors are typically typographical errors, omissions, and
incorrect case (upper or lower) for characters. Although it has been pointed out that
most e-mail systems have spell checkers, rejection of queries by ILIAD due to incorrect
formats could be a source of frustration for teachers, many of whom may be in a hurry
to squeeze some information retrieval activity into their busy daily schedules.

Upon the author’s recommendations, the ILIAD designers have implemented 
changes that make ILIAD no longer rigidly sensitive to spelling errors and innocuous 
characters like spaces and commas. 
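A tolerant reading of the query format could look something like the sketch below. It is only an illustration: the three field names come from the format shown above, but the function name `parse_query` and the normalization rules (ignore case, stray spaces, and commas) are assumptions, not a description of how ILIAD itself parses messages.

```python
import re

# Field names follow the query format shown above; the matching rules are
# assumptions made for illustration only.
KNOWN_FIELDS = {
    "teacheremail": "TeacherEmail",
    "volumelevel": "VolumeLevel",
    "?q1": "?Q1",
}

def parse_query(text):
    """Parse a query message, ignoring case, stray spaces, and commas."""
    fields = {}
    for line in text.splitlines():
        line = line.strip().lstrip("*").strip()
        if ":" not in line:
            continue  # skip blank or unrecognized lines
        name, value = line.split(":", 1)
        key = KNOWN_FIELDS.get(re.sub(r"\s+", "", name).lower())
        if key is None:
            continue  # unknown field name
        # Collapse commas and repeated whitespace into single spaces.
        fields[key] = " ".join(re.split(r"[,\s]+", value.strip())).strip()
    return fields

message = "* teacherEmail : a@b.edu\n*VOLUMELEVEL: Low\n*?q1: fuzzy,  logic"
print(parse_query(message))
```

Note that this sketch does not attempt spelling correction of the field names; it only relaxes case and punctuation.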


ILIAD Response Time And Relevance Of Returned Documents. 

When a query was sent, ILIAD first acknowledged receipt of the query within a 
minute. The documents themselves were received within one to two hours. 

More than thirty queries were submitted. These dealt with a variety of subject areas
including technology, science, government, history, and ancient architecture. Some
representative queries were: “parallel processing”, “genetic algorithms”, “intelligent
robots”, “fuzzy logic”, “microprocessors microcontrollers”, “wavelets”, “European
Economic Community”, and “Egyptian Pyramids”. Most of the documents returned in
response to these queries were found to be very relevant. However, documents returned in
response to queries like “Microsoft Corporation”, “wavelets communication”, and “volcano
eruptions Africa” were not particularly relevant. This is discussed further under the
section on “Key Words” or “Key Phrases” search.


Detailed Results Of Some Queries. 

Shown below is a tabulation of the details of six queries.

Query #   Query                     # of Docs.   # Docs. Relv.   % Docs. Relv.
1         parallel processing       25           10              40%
2         intelligent robots        25           12              48%
3         wavelets                  25           16              64%
4         Airbus Consortium         25           0               0%
5         wavelet communications    25           <5              <20%
6         microprocessors           25           >15             >60%
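The percentage column is simply the number of relevant documents divided by the number returned, a ratio usually called precision in information retrieval. The short sketch below recomputes it for the four queries whose counts are given exactly; the function name `percent_relevant` is introduced here for illustration.

```python
# (query, documents returned, documents judged relevant) for the four
# queries whose counts are stated exactly in the tabulation.
results = [
    ("parallel processing", 25, 10),
    ("intelligent robots", 25, 12),
    ("wavelets", 25, 16),
    ("Airbus Consortium", 25, 0),
]

def percent_relevant(returned, relevant):
    """Share of returned documents judged relevant, in percent."""
    return 100.0 * relevant / returned

for query, returned, relevant in results:
    print(f"{query}: {percent_relevant(returned, relevant):.0f}%")
```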

Key Word Choice And Relevance Of Returned Documents. 

The data shown above, together with other observed results, suggest that the ideal
query is a one-word query in which the single word is not used in many different
applications and contexts. For example, the query “wavelets” produced a high percentage
of relevant documents (64%), whereas the query “Airbus Consortium” produced none (0%).
Although the term “Airbus” generally refers to the airplanes manufactured by the European
consortium Airbus Industrie, the term “Consortium” has such widespread usage that ILIAD
found many documents describing consortiums that had nothing to do with Airbus.

“Key Words” Or “Key Phrases” Search? 

ILIAD uses the WAIS (Wide Area Information Server) [2] package developed at
Thinking Machines Corporation, Cambridge, Massachusetts. In his paper entitled
“Massively Parallel Information Retriever for Wide Area Information Servers”, Craig
Stanfill of Thinking Machines Corporation describes the way in which WAIS treats the
query thus [3]:

“Queries consist of short natural language phrases, such as ‘Corazon Aquino and the
Philippine Election’. Each phrase is broken into primitive components such as ‘Corazon
Aquino,’ and ‘Philippine Election,’ and each component is assigned a numerical weight
with rare (i.e. more specific) terms assigned higher value. The documents are then ranked
from highest to lowest, and the best matches presented to the user.”

The author has not been able to assess how efficiently WAIS does this weight
assignment in favor of ‘rare terms’. In the case of the query “wavelet communication”
(communication meaning information transmission in an engineering context), however, the
term ‘communication’ must not have been assigned a small enough weight, because too many
documents containing only the word ‘communication’ (in a non-engineering context),
and without the word ‘wavelet’, were returned by ILIAD. A higher percentage of relevant
documents would result if WAIS searched for occurrences of the whole phrase
representing the key words. Alternatively, a more stringent assignment of weights in
favor of rare terms could be used, so that the word ‘communication’ would be recognized
as an everyday word and assigned a very low weight wherever it occurs alone.
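To make the idea of weighting in favor of rare terms concrete, the sketch below uses a simple inverse-document-frequency weight, log(N / df). This is a generic textbook scheme chosen for illustration; it is not claimed to be the weighting WAIS actually uses, and the function names and the tiny document set are hypothetical.

```python
import math

def rarity_weights(terms, documents):
    """Weight each query term by log(N / document frequency); rare terms score higher."""
    n = len(documents)
    weights = {}
    for term in terms:
        df = sum(1 for doc in documents if term in doc.lower().split())
        weights[term] = math.log(n / df) if df else 0.0
    return weights

def rank(terms, documents):
    """Sort documents by the summed weights of the query terms they contain."""
    weights = rarity_weights(terms, documents)

    def score(doc):
        words = set(doc.lower().split())
        return sum(w for t, w in weights.items() if t in words)

    return sorted(documents, key=score, reverse=True)

# Hypothetical four-document collection.
docs = [
    "wavelet transforms for signal communication",
    "communication skills for managers",
    "improving communication in the classroom",
    "office communication guidelines",
]
print(rarity_weights(["wavelet", "communication"], docs))
print(rank(["wavelet", "communication"], docs)[0])
```

With this weighting, a word that appears in every document contributes nothing to the score, so documents containing only ‘communication’ can no longer outrank the one that also contains ‘wavelet’.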


KALMAN FILTER DESIGN 


Introduction. 

The second project involved the design of a Kalman filter for estimating motion
parameters of a three-dimensional body from a sequence of two-dimensional images.
Geometrical techniques had been used by Jodi Seaborn and Robert Goode of ER4 to obtain
measured values of the object’s center coordinates, X, Y, Z, as well as the pitch, roll, and
yaw, as determined from at least three image corresponding points on the object and on the
two-dimensional picture [4]. The idea is to use the Kalman filter to estimate these
parameters in addition to their time rates of change.

System Modeling, Determining The State Transition Matrix [5,6].

The object to be tracked can be modelled by the state equation

\[ x(k+1) = \Phi x(k) + u(k) \qquad (1) \]

where

\[
x(k) =
\begin{bmatrix}
X(k) \\ \dot{X}(k) \\ Y(k) \\ \dot{Y}(k) \\ Z(k) \\ \dot{Z}(k) \\
\alpha(k) \\ \dot{\alpha}(k) \\ \beta(k) \\ \dot{\beta}(k) \\ \chi(k) \\ \dot{\chi}(k)
\end{bmatrix}
\quad
\begin{array}{l}
\text{X - deviation} \\ \text{X - rate} \\ \text{Y - deviation} \\ \text{Y - rate} \\
\text{Z - deviation} \\ \text{Z - rate} \\ \text{pitch - deviation} \\ \text{pitch - rate} \\
\text{yaw - deviation} \\ \text{yaw - rate} \\ \text{roll - deviation} \\ \text{roll - rate}
\end{array}
\]

Assume an object being tracked is at coordinates $X' + X(k)$, $Y' + Y(k)$, $Z' + Z(k)$,
$\mathrm{A} + \alpha(k)$, $\mathrm{B} + \beta(k)$, $\mathrm{X} + \chi(k)$ at time $k$, and at
coordinates $X' + X(k+1)$, $Y' + Y(k+1)$, $Z' + Z(k+1)$, $\mathrm{A} + \alpha(k+1)$,
$\mathrm{B} + \beta(k+1)$, $\mathrm{X} + \chi(k+1)$ at time $k+1$, $T$ seconds
later. $T$ represents the time spacing between two successive two-dimensional still images
of the moving object taken by a camera. We are interested in estimating these linear and
angular deviations and their rates, which are assumed to be statistically random with
zero-mean values.

To a first approximation, if the object is moving at velocities, or rates, given by
$\dot{X}(k), \dot{Y}(k), \dot{Z}(k), \dot{\alpha}(k), \dot{\beta}(k), \dot{\chi}(k)$, and $T$
is not too large, then considering, for example, the Z coordinate, we have

\[ Z(k+1) = Z(k) + T\dot{Z}(k) \qquad (2) \]

which is an example of a “deviation equation”.

Similarly, considering the acceleration $u(k)$, we have

\[ T u(k) = \dot{Z}(k+1) - \dot{Z}(k) \qquad (3) \]

which is the “acceleration equation”. Assuming that $u(k)$ is a zero-mean, stationary white
noise process, the acceleration is, on the average, zero and uncorrelated between intervals,
$E[u(k+1)u(k)] = 0$, but it has some variance $E[u^2(k)] = \sigma_u^2$. Such accelerations
could be caused by short-term irregularities in external influences on the object. The
quantity $u_3(k) = T u(k)$ is also a white noise process, and therefore the acceleration
equation for the coordinate Z can be written as

\[ \dot{Z}(k+1) = \dot{Z}(k) + u_3(k) \qquad (4) \]
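Equations (2) and (4) can be stepped forward numerically for a single coordinate, as in the sketch below. The function name `simulate_axis` and the choice of drawing the acceleration uniformly between -M and +M are assumptions made for illustration (the uniform density matches the one the report adopts later when assigning variances).

```python
import random

def simulate_axis(z0, rate0, T, M, steps, seed=0):
    """Propagate one deviation and its rate using equations (2) and (4)."""
    rng = random.Random(seed)
    z, rate = z0, rate0
    history = [(z, rate)]
    for _ in range(steps):
        u = rng.uniform(-M, M)     # random acceleration over the interval (assumed uniform)
        z_next = z + T * rate      # equation (2): deviation update
        rate_next = rate + T * u   # equation (4): rate update, with u3 = T * u
        z, rate = z_next, rate_next
        history.append((z, rate))
    return history

# With M = 0 there is no acceleration and the motion is constant-rate.
print(simulate_axis(z0=0.0, rate0=1.0, T=0.5, M=0.0, steps=4)[-1])
```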



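The complete set of deviation and acceleration equations for the twelve parameters
is:

\[
\begin{aligned}
X(k+1) &= X(k) + T\dot{X}(k) \\
\dot{X}(k+1) &= \dot{X}(k) + u_1(k) \\
Y(k+1) &= Y(k) + T\dot{Y}(k) \\
\dot{Y}(k+1) &= \dot{Y}(k) + u_2(k) \\
Z(k+1) &= Z(k) + T\dot{Z}(k) \\
\dot{Z}(k+1) &= \dot{Z}(k) + u_3(k) \\
\alpha(k+1) &= \alpha(k) + T\dot{\alpha}(k) \\
\dot{\alpha}(k+1) &= \dot{\alpha}(k) + u_4(k) \\
\beta(k+1) &= \beta(k) + T\dot{\beta}(k) \\
\dot{\beta}(k+1) &= \dot{\beta}(k) + u_5(k) \\
\chi(k+1) &= \chi(k) + T\dot{\chi}(k) \\
\dot{\chi}(k+1) &= \dot{\chi}(k) + u_6(k)
\end{aligned}
\qquad (5)
\]

where

\[
\begin{bmatrix}
u_1(k) \\ u_2(k) \\ u_3(k) \\ u_4(k) \\ u_5(k) \\ u_6(k)
\end{bmatrix}
=
\begin{bmatrix}
\text{X - acceleration} \\ \text{Y - acceleration} \\ \text{Z - acceleration} \\
\text{pitch - acceleration} \\ \text{yaw - acceleration} \\ \text{roll - acceleration}
\end{bmatrix}
\quad \text{between time } k \text{ and } k+1
\]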


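The complete state equation for the system is given by

\[
\begin{bmatrix}
x_1(k+1) \\ x_2(k+1) \\ x_3(k+1) \\ x_4(k+1) \\ x_5(k+1) \\ x_6(k+1) \\
x_7(k+1) \\ x_8(k+1) \\ x_9(k+1) \\ x_{10}(k+1) \\ x_{11}(k+1) \\ x_{12}(k+1)
\end{bmatrix}
=
\left[\begin{array}{cccccccccccc}
1 & T & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & T & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & T & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & T & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & T & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & T \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right]
\begin{bmatrix}
x_1(k) \\ x_2(k) \\ x_3(k) \\ x_4(k) \\ x_5(k) \\ x_6(k) \\
x_7(k) \\ x_8(k) \\ x_9(k) \\ x_{10}(k) \\ x_{11}(k) \\ x_{12}(k)
\end{bmatrix}
+
\begin{bmatrix}
0 \\ u_1(k) \\ 0 \\ u_2(k) \\ 0 \\ u_3(k) \\ 0 \\ u_4(k) \\ 0 \\ u_5(k) \\ 0 \\ u_6(k)
\end{bmatrix}
\qquad (6)
\]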
The measured data (the six motion parameters) are assumed to be noisy, and are modelled
thus:

\[
\begin{aligned}
y_1(k) &= x_1(k) + v_1(k) && \text{X - deviation} \\
y_2(k) &= x_3(k) + v_2(k) && \text{Y - deviation} \\
y_3(k) &= x_5(k) + v_3(k) && \text{Z - deviation} \\
y_4(k) &= x_7(k) + v_4(k) && \text{pitch - deviation} \\
y_5(k) &= x_9(k) + v_5(k) && \text{yaw - deviation} \\
y_6(k) &= x_{11}(k) + v_6(k) && \text{roll - deviation}
\end{aligned}
\qquad (7)
\]

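Therefore, the data vector can be written as:

\[
\begin{bmatrix}
y_1(k) \\ y_2(k) \\ y_3(k) \\ y_4(k) \\ y_5(k) \\ y_6(k)
\end{bmatrix}
=
\left[\begin{array}{cccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{array}\right]
\begin{bmatrix}
x_1(k) \\ x_2(k) \\ x_3(k) \\ x_4(k) \\ x_5(k) \\ x_6(k) \\
x_7(k) \\ x_8(k) \\ x_9(k) \\ x_{10}(k) \\ x_{11}(k) \\ x_{12}(k)
\end{bmatrix}
+
\begin{bmatrix}
v_1(k) \\ v_2(k) \\ v_3(k) \\ v_4(k) \\ v_5(k) \\ v_6(k)
\end{bmatrix}
\qquad (8)
\]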

In terms of vector formulation, the two vector equations representing the system
and measurement models are:

\[
\begin{aligned}
x(k+1) &= \Phi(k)\,x(k) + u(k) \\
y(k) &= H(k)\,x(k) + v(k)
\end{aligned}
\qquad (9)
\]
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= state transition matrix 



'1 

T 

0 

0 

0 

0 

0 

0 

0 

0 


0 

1 

0 

0 

0 

0 

0 

0 

0 

0 


0 

0 

1 

T 

0 

0 

0 

0 

0 

0 


0 

0 

0 

1 

0 

0 

0 

0 

0 

0 


0 

0 

0 

0 

1 

T 

0 

0 

0 

0 

<h = 

0 

0 

0 

0 

0 

1 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

T 

0 

0 


0 

0 

0 

0 

0 

0 

0 

1 

0 

0 


0 

0 

0 

0 

0 

0 

0 

0 

1 

T 


0 

0 

0 

0 

0 

0 

0 

0 

0 

1 


0 

0 

0 

0 

0 

0 

0 

0 

0 

0 


_0 

0 

0 

0 

0 

0 

0 

0 

0 

0 



'1 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

O' 


0 

0 

1 

0 

0 

0 

0 

0 

0 

0 

0 

0 

H = 

0 

0 

0 

0 

1 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

0 

0 

0 

0 


0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

0 

0 


0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 


0 O' 
0 0 
0 0 
0 0 
0 0 
0 0 
0 0 
0 0 
0 0 
0 0 

1 T 
0 1 


is the observation matrix. 
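Because both matrices repeat one small pattern per motion parameter, they can be generated rather than typed out. The sketch below builds them as nested lists in plain Python; the helper names `build_phi`, `build_h`, and `matvec` are introduced here for illustration and do not come from the report.

```python
def build_phi(T, n_axes=6):
    """12x12 state transition matrix: one [[1, T], [0, 1]] block per parameter."""
    n = 2 * n_axes
    phi = [[0.0] * n for _ in range(n)]
    for a in range(n_axes):
        d, r = 2 * a, 2 * a + 1   # indices of the deviation and its rate
        phi[d][d] = 1.0
        phi[d][r] = T
        phi[r][r] = 1.0
    return phi

def build_h(n_axes=6):
    """6x12 observation matrix: each measurement picks out one deviation."""
    h = [[0.0] * (2 * n_axes) for _ in range(n_axes)]
    for a in range(n_axes):
        h[a][2 * a] = 1.0
    return h

def matvec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

# X deviation 1.0 moving at rate 2.0; all other states zero.
x = [1.0, 2.0] + [0.0] * 10
print(matvec(build_phi(0.5), x)[:2])   # predicted X deviation and rate
print(matvec(build_h(), x))            # noise-free measurement vector
```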

The additive noise, $v(k)$, is usually assumed to be Gaussian with zero mean and variances
$\sigma_X^2(k), \sigma_Y^2(k), \sigma_Z^2(k), \sigma_\alpha^2(k), \sigma_\beta^2(k), \sigma_\chi^2(k)$.

The next step is to formulate the noise covariance matrices, Q for the system model and
R for the measurement model.

Since we are assuming there is no correlation between noise processes, either in the 
case of system noise processes, or in the measurement noise processes, the off-diagonal 
terms of both the observation noise covariance matrix, R, and the system noise covariance 
matrix, Q, are all zero. 

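For the measurement model, the noise covariance matrix is given by

\[
R(k) = E[v(k)\,v^T(k)] =
\begin{bmatrix}
\sigma_X^2(k) & 0 & 0 & 0 & 0 & 0 \\
0 & \sigma_Y^2(k) & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma_Z^2(k) & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma_\alpha^2(k) & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma_\beta^2(k) & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma_\chi^2(k)
\end{bmatrix}
\qquad (10)
\]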
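and the system noise covariance matrix is, for this case, given by

\[
Q(k) = E[u(k)\,u^T(k)] =
\left[\begin{array}{cccccccccccc}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sigma_1^2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma_2^2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma_3^2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \sigma_4^2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \sigma_5^2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \sigma_6^2
\end{array}\right]
\qquad (11)
\]

where
$\sigma_1^2 = E(u_1^2)$, $\sigma_2^2 = E(u_2^2)$, $\sigma_3^2 = E(u_3^2)$,
$\sigma_4^2 = E(u_4^2)$, $\sigma_5^2 = E(u_5^2)$, $\sigma_6^2 = E(u_6^2)$
are the variances of T times the linear and angular accelerations respectively.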
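Both covariance matrices are diagonal, so they can be assembled directly from lists of variances. The following sketch does this in plain Python; the helper names `build_q` and `build_r` and the numerical values in the usage lines are hypothetical.

```python
def build_q(variances):
    """12x12 diagonal Q: zero on deviation rows, sigma_i^2 on rate rows."""
    n = 2 * len(variances)
    q = [[0.0] * n for _ in range(n)]
    for i, var in enumerate(variances):
        q[2 * i + 1][2 * i + 1] = var   # rate state of parameter i
    return q

def build_r(variances):
    """6x6 diagonal R built from the six measurement noise variances."""
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical variance values, for illustration only.
Q = build_q([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
R = build_r([0.1] * 6)
print([Q[i][i] for i in range(12)])
```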

Specific values must be substituted for those variances in order to define the
Kalman filter numerically. One way to proceed is to assume that the probability density
function (p.d.f.) for each of the six accelerations is uniform and equal to $p(u) = 1/(2M)$
between limits $+M$ and $-M$. The variance then is $\sigma_u^2 = M^2/3$.
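The value of the variance follows from a short integration, written out here for completeness. The symbol $\sigma_u^2$ denotes the variance of a single acceleration $u$; the mean is zero because the density is symmetric about zero.

```latex
\sigma_u^2 = E[u^2] - \left(E[u]\right)^2
           = \int_{-M}^{M} \frac{u^2}{2M}\,du - 0
           = \frac{1}{2M}\cdot\frac{2M^3}{3}
           = \frac{M^2}{3}
```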


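Therefore, the six variances for the system noise covariance matrix are:

\[
\begin{aligned}
\sigma_1^2 = \sigma_2^2 = \sigma_3^2 &= T^2 \sigma_u^2 \\
\sigma_4^2 &= \sigma_1^2 / (Y')^2 \\
\sigma_5^2 &= \sigma_1^2 / (X')^2 \\
\sigma_6^2 &= \sigma_1^2 / R^2
\end{aligned}
\qquad (12)
\]

where X', Y' are the average X and Y coordinates, and R is the radial distance, of the image
corresponding points, all taken with respect to the center of the object.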

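One other covariance matrix must be determined. This is the “error covariance
matrix”, P. It contains the mean-square errors of all the estimates with respect to their
actual values. Its vector form is:

\[ P(k) = E[e(k)\,e^T(k)] \qquad (13) \]

For twelve signals, we have a 12x12 matrix, thus:

\[
P(k) =
\begin{bmatrix}
E[e_1^2(k)] & E[e_1(k)e_2(k)] & \cdots & E[e_1(k)e_{12}(k)] \\
E[e_2(k)e_1(k)] & E[e_2^2(k)] & \cdots & E[e_2(k)e_{12}(k)] \\
\vdots & \vdots & \ddots & \vdots \\
E[e_{12}(k)e_1(k)] & E[e_{12}(k)e_2(k)] & \cdots & E[e_{12}^2(k)]
\end{bmatrix}
\qquad (14)
\]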


The Kalman Filter And The Computational Process. 

A form of the Kalman Filter equations suitable for numerical computation is indicated
below.

Estimator:

\[ \hat{x}(k) = \Phi\hat{x}(k-1) + K(k)\,[\,y(k) - H\Phi\hat{x}(k-1)\,] \]

Filter gain:

\[ K(k) = P_1(k)H^T\,[\,H P_1(k) H^T + R(k)\,]^{-1} \]

where $P_1(k) = \Phi P(k-1)\Phi^T + Q(k-1)$

Error covariance matrix:

\[ P(k) = P_1(k) - K(k)\,H\,P_1(k) \]
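One cycle of the estimator, gain, and error covariance computations can be sketched for a single motion parameter as follows. The sketch relies on an observation that is an assumption of this example rather than a statement from the report: since the state transition, observation, and noise covariance matrices are all block diagonal, the filter separates into six independent two-state filters whenever P is initialized block diagonal, and the matrix inverse in the gain reduces to a scalar division. The function name `kalman_step` and its argument names are hypothetical.

```python
def kalman_step(x, P, y, T, q, r):
    """One predict/update cycle for a single motion parameter.

    x: [deviation, rate] estimate at k-1
    P: 2x2 error covariance at k-1 (nested lists)
    y: measured deviation at k
    q: variance of T times the acceleration
    r: measurement noise variance
    """
    # Prediction Phi x(k-1), with Phi = [[1, T], [0, 1]]
    xp = [x[0] + T * x[1], x[1]]
    # P1 = Phi P Phi^T + Q, with Q = diag(0, q)
    p1 = [
        [P[0][0] + T * (P[0][1] + P[1][0]) + T * T * P[1][1], P[0][1] + T * P[1][1]],
        [P[1][0] + T * P[1][1], P[1][1] + q],
    ]
    # Gain K = P1 H^T (H P1 H^T + R)^-1, with H = [1, 0]; the inverse is a scalar
    s = p1[0][0] + r
    K = [p1[0][0] / s, p1[1][0] / s]
    # Estimate: prediction plus gain times the measurement residual
    innov = y - xp[0]
    x_new = [xp[0] + K[0] * innov, xp[1] + K[1] * innov]
    # P = P1 - K H P1
    P_new = [[p1[i][j] - K[i] * p1[0][j] for j in range(2)] for i in range(2)]
    return x_new, P_new, K

x_est, P_est, gain = kalman_step([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                                 y=2.0, T=1.0, q=0.0, r=0.0)
print(x_est, P_est, gain)
```

With r = 0 the measurement is trusted completely, so the deviation estimate equals the measurement and its error variance drops to zero.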

To start Kalman processing we have to initialize the gain matrix K(k). For this
purpose, the error covariance matrix P(k) has to be specified in some way. A reasonable
initialization can be established using a sequence of two consecutive measurements, in this
case, two consecutive images. This is meaningful in situations where actual initial deviation
values are known. This will give six linear and angular deviations at time k=1, and six linear
and angular deviations at time k=2. From these twelve measurement data, and using the
group of deviation and acceleration equations provided earlier, estimates of all twelve
parameters at time k=2 can be generated. From this, the error covariance matrix at time
k=2 can be computed, seeing that

\[ P(2) = E\{[x(2) - \hat{x}(2)]\,[x(2) - \hat{x}(2)]^T\} \]





From P(2), the Kalman predictor gain K(3) can be calculated, from which estimates at
time k=3 can be generated. Such a computational approach will lead to computation, in a
sequential fashion, of estimates at times k=4, 5, 6, etc.

An alternative initialization approach is to make the error covariance matrix a
diagonal matrix with variance values of 1. In essence, we overestimate the error
covariance. Its only effect is to slow down the convergence of the filter. The filter itself
builds up the covariance matrix, just as it builds up the gain.
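The two initialization options can be sketched as below. The finite-difference rate estimate is one reading of how the deviation and acceleration equations yield twelve estimates from two sets of six measurements (the rate is taken as the change in deviation divided by T, with the zero-mean acceleration ignored); it is an assumption of this sketch, and the function names are hypothetical.

```python
def init_from_two_measurements(y1, y2, T):
    """Initial 12-element state at k=2 from two consecutive 6-element measurements."""
    x = []
    for d1, d2 in zip(y1, y2):
        x.append(d2)              # deviation estimate: the latest measurement
        x.append((d2 - d1) / T)   # rate estimate: finite difference over T
    return x

def identity_covariance(n=12):
    """Alternative start: diagonal error covariance with unit variances."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical measurements at k=1 and k=2, with T = 0.5 seconds.
x0 = init_from_two_measurements([0.0] * 6, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], T=0.5)
P0 = identity_covariance()
print(x0)
```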